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Abstract 

Solving elliptic PDEs in more than one dimension can be a computationally ex- 
pensive task. For some applications characterised by a high degree of anisotropy 
in the coefficients of the elliptic operator, such that the term with the highest 
derivative in one direction is much larger than the terms in the remaining direc- 
tions, the discretized elliptic operator often has a very large condition number 
- taking the solution even further out of reach using traditional methods. This 
paper will demonstrate a solution method for such ill-behaved problems. The 
high condition number of the Z3-dimensional discretized elliptic operator will be 
exploited to split the problem into a series of well-behaved one and {D — 1)- 
dimensional elliptic problems. This solution technique can be used alone on 
sufficiently coarse grids, or in conjunction with standard iterative methods, such 
as Conjugate Gradient, to substantially reduce the number of iterations needed 
to solve the problem to a specified accuracy. The solution is formulated analyt- 
ically for a generic anisotropic problem using arbitrary coordinates, hopefully 
bringing this method into the scope of a wide variety of applications. 
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1. Introduction 



The solution of Poisson problems is often the single most expensive step in 
the numerical solution of partial differential equations (PDEs). For example, 
when solving the Navier-Stokes or Euler equations, the Poisson problem arises 
from the incompressibility condition IJ. The particular solution strategy 
depends of course on a combination of factors, including the specific choice of 
the discretization and the type of boundary conditions. In simple geometries, 
very efficient schemes can be devised to reduce the effective dimensionality of 
the problem, such as using FFTs or cyclic reduction to partially or completely 
diagonalize the operator [2]. For more complex geometries or boundary con- 
ditions, the available choices to solve the discretized problem usually involves 
direct inversion for small size problems, while Krylov based iterative methods 
such as Conjugate Gradient or multigrid methods are used when the matrix 
problem is too large to be inverted exactly 0, 3, . 

In this paper, we are interested in Poisson problems characterized by a high 
level of anisotropy (to be precisely defined later). The source of anisotropy can 
be due to the highly flattened domains over which the solution is sought, as is 
the case for atmospheric, oceanic \^ and some astrophysical problems P, P- 77]. 
However, the source of anisotropy could have a physical base, e.g. Non-Fickian 
diffusion problems where the flux is related to the gradient by an anisotropic 
tensor ^ . Recently, research has been conducted to develop compound materi- 
als that could serve as a cloaking device. These metamaterials are designed 
to have specific anisotropic acoustic and electromagnetic properties that divert 
pressure and light waves around a region of space unscathed. 

Anisotropy results in a spreading of the spectrum of the discretized operator, 
with severe consequences on the convergence rate. We illustrate this point with 
a simple Poisson problem 

= p. 

The r.h.s. and boundary conditions are chosen randomly (but compatible, see 
below). The Laplacian operator is discretized with a standard 7-point, second- 
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order stencil, the domain is rectangular with dimensions LxLxH, the domain's 
aspect ratio is measured by i? = L/H, and the discretization is chosen as 
(Ax, Ay, Az) — {H/2, H/2, H/16). For illustration purposes, the problem is 
solved using a 4-level multigrid scheme which employs line relaxation in the 
vertical (stiff) direction, as used by Armenio and Roman 9] to do a LES of a 
shallow coastal area. Figure [T] shows the attenuation factor A = ||ii'„+i||/||i?„|| 
as a function of the aspect ratio, where En is the residual error after n iterations. 
For moderate aspect-ratio domains, the convergence is satisfactory, but as R 
increases, we rapidly approach a point where the method becomes, for practical 
purposes, useless. Similar results (see below) hold for Krylov based methods. 
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Figure 1 : Attenuation factor as a function of domain aspect ratio for a Poisson problem solved 
used a standard multigrid scheme. 

In this paper, we describe how a formal series solution of a Poisson problem 
derived by Scotti and Mitran [lOj], herein referred to as SM, can be used to 
significantly speed up the convergence of traditional iterative schemes. SM in- 
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troduced the concept of grid lepticity, A, to describe the degree of anisotropy of a 
discretized domain and then sought a solution to the Poisson problem written as 
a power series in - the leptic expansion. An apparent limitation of the leptic 
expansion is that it is very efficient only for lepticity larger than a critical value 
of order 1. SM were led to introduce the leptic expansion in order to provide the 
right amount of dispersion needed to balance nonlinear steepening of internal 
waves propagating in a shallow stratified ocean. For this limited purpose, SM 
showed that at most only three terms in the expansion are needed, and thus the 
lack of overall convergence was not a serious limitation. Here, we develop the 
method for the purpose of efficiently calculating solutions of a discretized Pois- 
son problem. In our approach, the lepticity, which in SM's original formulation 
was related to the aspect ratio of the domain, becomes now a generic measure 
of anisotropy. The main result of this paper is that for subcritical values of the 
lepticity, the leptic expansion can still be extremely valuable to dramatically 
increase the convergence rate of standard iterative schemes, as the numerical 
demonstrations of the method will show. The examples are coded using Mat- 
lab and Chombo's BoxTooltQ library with standard second order discretization 
techniques. 

What makes the leptic expansion particularly attractive is that it can be 
parallelized in a very straightforward way, as long as the decomposition of the 
domain does not split along the stiff (vertical, in our examples) direction. For 
comparison, the parallel implementation of the Incomplete Cholesky Decom- 
position of a sparse matrix, which is used as a preconditioner for Conjugate 
Gradient schemes and yields very good con verg ence rates even at high levels of 
anisotropy [ll|, is a highly non-trivial task 12 1. 

Finally, it must be noted that the idea behind the leptic expansion can be 
traced as far back as the work of Bousinnesq on waves [U] . What we have done 
here is to formulate it in a way suitable for numerical calculations. 



^ Chombo has been developed and is being distributed by the Applied Numerical Algorithms 
Group of Lawrence Berkeley National Lab. Website: https://seesar.lbl.gov/ANAG/chombo/ 
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The rest of the paper is organized as fohows: a discretization- and coordinate- 
independent version of the leptic expansion is introduced in Section [21 The 
reader who is not interested in the details may skip directly to Section 12.21 
where a summary of the scheme is provided. Section [3] presents convergence 
estimates of the leptic expansion using Fourier analysis techniques 14]. This is 
where the leptic method's potential to generate initial guesses for conventional 
iterative schemes emerges. In Section 31 we consider some examples to illus- 
trate how the leptic expansion can be used with conventional iterative schemes 
to create very efficient solvers. A final section summarizes the main results. 

1.1. Notation 

As an aid to the following discussion, we define the relevant notation here. 

• Horizontal coordinate directions: — x,x^ — y. Vertical (stiff) coordi- 
nate direction: = z. 

• H — vertical domain extent. L = horizontal domain extent. 

• (f) = full solution. — solution of p"^-order equation. 

• (f)" and (f)^ — solution of vertical and horizontal equations (explained in 
section (2). 

• Summation indicej^ (summed from 1 to 3): 

• Horizontal summation indices (summed from 1 to 2): m,n. 

• u" = {u* ,v* ,'w*y , the flux field that will be used as Neumann boundary 
conditions. 

• p = the source term of the elliptic PDE. 

• (T*^ = a symmetric, contravariant tensor field. 

will use summation notation. Repeated indices imply a sum unless explicitly stated 
otherwise. 
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• V is the S-dimensional domain and dV ~ dx^dx^dx^ . 

• Ai is the boundary of V in the i*^-direction. 

• dAi — {dx^dx^, dx^ dx^ , dx^ dx^)^ , the area element at the boundary of V. 

• 5 is the 2-dimensional horizontal domain with local coordinates (a:^,x^) 
and dS — dx^dx^. 

• dim — {dx^ , dx^)^, the line element around the boundary of S. 

• A{x,y) — -jj J^^ A {x, y, z) dz, the vertical average of A. 

• The leptic ratio is defined as A = Ax/ H , where Ax is the horizontal mesh 
spacing. 

2. Derivation 

The problem we wish to solve is an anisotropic elliptic PDE with Neumann 
boundary conditions of the type that often arises when solving the incompress- 
ible Navier-Stokes equations [ll Q • More precisely, we wish to solve the following 
equation 

d.a'^djcj) = pinV (1) 
a'^dj<j> = u*' on dV, 

where a^^ is a positive-definite, symmetric tensor field. The only restriction on 
p and is that they must be compatible with one another. That is, if we 
integrate eq. ^ over V and apply Stokes' theorem, we obtain an identity that 
must be obeyed by the sources, 

/ pdV= f u*'dAi. (2) 
Jv Jav 

In general, the domain can have any number of dimensions higher than 1, 
but without loss of generality we will restrict ourselves to the 3-dimensional 
case. We will also assume there is a small parameter, e, that we can use to 
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identify terms of the field and operator in a formal perturbation expansion of 
the form 

(j) = 0o + e0i+e202+e^03 + --- (3) 

In this section, we will derive a method to solve eqs. ([1} using the expansion 
©• 

2.1. The desired form of the expansion 

We begin by plugging expansion (jSj into the first of eqs. ([1]) and equating 
powers of e. 

d^cT^^d^cj>o - p (4) 
^sa^'^s^i + (9™a"3a3 + 93a3"9„ + 9,„fT™"9„) 00 = (5) 
93a=^'93(/.2 + (9™a"3a3 + a3a3"a„ + a^a^'a^) 01 = O (6) 

etc . . . 

The first thing to notice is that eq. Q with Neumann boundary conditions 
can only determine 0o up to an additive function of x and y. We will call the 
solution of eq. ([4]) (f)Q{x,y,z) and the still undetermined function (f)Q{x,y). We 
might as well preemptivley write the fields at every order as 

(t>n{x,y,z) = <j>l{x,y,z) + <l>n{x,y) 

so that equations read 

93^3^930^ = P (7) 
^3^339305; + a„,a"3530- + (a3a3"9„ + a„,a""9„) 00 = O (8) 

93fT3'9302 + ^m^^'ft^l + (%<^'"5n + 9m^7""9„) 01 = (9) 

etc . . . 
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In order to solve this set of equations, we must define our boundary condi- 
tions at each order. At 0(1), we will set 



where and z_ denote the evaluation at the upper and lower boundaries, 
respectively. The excess function, wq, is defined at each x and y to make eq. 
([7]) consistent with its boundary conditions. By vertically integrating eq. (O, 
we see that 



/ pdz = a^^d- 



that is, 

Wo = w^l^t - / pdz. 



This completely determines along vertical lines at each x and y. Notice that 
we have not yet chosen the gradients of (/)q normal to the horizontal boundaries. 
We will save this freedom for later. Right now, we need to look at the 0{e) 
equations to get 0q . 

For the moment, let us think of eq. ([5]) as an equation for (\)\. We again 
need to specify vertical boundary conditions. We will define 



cy^'^dz4>\\^ - iSo- fT3"9„0oL (10) 

^33 p\ xv\ ^3n / 



Defining a new excess function is unnecessary because it can just be absorbed 
into the still undetermined function 0q. As before, we vertically integrate eq. 
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J Z- 



= Wo - 

If we divide by i?, the vertical integrals become vertical averages, which will be 
denoted with overbars. When taking these averages, remember that while 0'' 
was defined to be independent of z, no such assumption was made for cr'-'. This 
leaves us with an equation for 0q, 



9m^5„0g = - (11) 
ti 

If (f)Q is chosen to be any solution to this equation, then eq. ([8]) for <j)\ together 
with the boundary conditions (jlOp will be consistent. Now, we define boundary 
conditions for eq. (|TT|) to be 

^^"'^o|^ea5 = ^l^ea5- (12) 

This choice of boundary condition will be made consistent with equation (llip 
in the following steps. First, we integrate eq. (jlip over S and reorganize the 
result. 

\- [ wodS - <f a-^'Wjdpldlra = - I ^dS - [ d^'^^^^^d'^dS 

n Js JdS JS tl Js 



H 



Is JdS Js Js 

iJmO c„ (,'■'' 

Is 



d„^a^-^dn^odS 



as 

v*^dL 



Ids 

Next, we exploit the remaining freedoms in the boundary conditions of 0q by 
choosing (T™-'9j0Q — u*™ — u*™ at the horizontal boundaries of V. This, together 
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with the definition of wq, gives us 




Noting that the third and fourth terms cancel, this simplifies to become 

H (f TP^dl,n+ I w*\l+dS^ I pdV. 
Jas Js Jv 

Now, if this equation holds, then the boundary conditions (1121) will be consistent 

with the horizontal equation ([Tl]) . From equation ([2]), we see that this is indeed 

the case. Therefore, equations (fTTj) and (fT2|) completely determine 0q, and in 

turn, 00- Having (j)Q at our disposal, we can now tidy up eq. ([S]) a bit. 

The first two terms in parentheses in the second line are zero via eq. Since 
this equation along with boudary conditions ()10p are consistent, this completely 
determines (pi at each x and y. There is, as before, another freedom yet to be 
chosen - the gradients of (p^ normal to the horizontal boundaries. Again, we 
will choose them later. 

We continue in the same manner, obtaining an equation for 02 whose Neu- 
mann compatibility condition is the equation for cj)^. At this point, we might 
as well just derive the equations for the general order fields (j)p and <j)p_i where 
p> 2. We start with 

53^''530; + a„a'"3a30-_^ + (a™a""9„ + a3a3"a„) 0p_i = (13) 

and the vertical boundary conditions for <pp 

Vertically averaging eq. (jl3p and rearranging a bit gives us 
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which upon applying boundary conditions and rearranging some more gives us 
the horizontal equation 



dmu-^^dnrp^^ = -9™fT™J9j0;;_i. (14) 

This equation will be compatible with homogeneous boundary conditions if 
we chose the gradients of 0^ at the horizontal boundaries wisely. By integrating 
eq. (dU over S, we find 



= f o-™"(9„0^_id;„ 

JdS 

Js 

= - [ dma"'W,rp-idS 
Js 



las 

It is tempting now to let the gradients of (pp^i for p > 2 be identically zero, but 
a more appropriate choice is 

a"^^d,rp-i = - dn<Pl-2, (15) 

which is equivalent to 



-(T™"a„(/)^_1, p>2. 



This choice will average to zero, satisfying the above integral, as well as prevent 
inconsistencies later. 

Finally, we can clean up eq. to get an equation for 

53^''53<^; =p- dy^d, (00 + </>! + ... + <^p-l) 

This completes the derivation of the needed equations. It may seem like some 
of the boundary conditions were chosen only to be compatible with their cor- 
responding differential equation, when in fact we chose them carefully so that 
the sum of their contributions is it** for each direction, i. At the horizontal 
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boundaries, 

= {u*'"''- - u^) + (TF^) + . . . + (0) + . . . 
at the upper vertical boundary, 

= (w* - wo) + {wo) + . . . + (0) + . . . 

= w\ 

and at the lower vertical boundary, 

= + (0) + . . . + (0) + . . . 

Summary of the expansion 

Tables ([!]) and ([2]) provide a concise overview of the steps involved in gener- 
ating the n*''-order of the expansion in generalized and Cartesian coordinates, 
respectively. The problem is formulated recursively. The left hand side is the 
same at each step. However, it must be noted that the solution of the {D — 1)- 
dimensional Poisson problem can be postponed or in some cases eliminated 
depending on the magnitude of the correction required. This will be discussed 
in section 12.31 For a very simple example problem solved using this expansion, 
see section [31 
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• 

w*\^^--wo, z = z+ 

W*\^ , Z= Z- 




<7'"''''"dn(l>ci\^^^<, = U*"^U^-jo 

\xeds ixeds 


0{e) 


d3cr^^d3<i>l = p- dia'idj<t>o 


'^^Wll,^ = • 


-(T3"a„(/>o^ , 2 = z- 






0{s'n 


dsa^'^d-i^p = P- 9ia'^d, 0, j 






5^(9„<L^„, = 



Table 1: The general form of the expansion. The indices i,j extend over all directions and 
the indices m, n do not include the vertical (thin) direction. The excess function is defined as 
wo = 1^*11+ - pdz. 
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Wo, z = z+ 
0, ^; = 2:_ 







No horizontal equation. 




'1 = 





































No horizontal equation. 











Table 2: The expansion in Cartesian coordinates. The indices i,j extend over all directions 
and the indices m, n do not include the vertical (thin) direction. The excess function is defined 
as wo = w*\z'^ — /^+ pdz and the horizontal Laplacian is = + dy. 



2.3. Eliminating horizontal stages 

Typically, solutions of the horizontal problems are more expensive than so- 
lutions of the vertical problems. Sometimes, we can skip the horizontal stages 
altogether if we know in advance that its solution will not contribute to the 
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overall convergence of the method. For example, suppose 



/ A{x,y) Dix,y) 







a' 



D{x,y) B{x,y) 







V 







C{x,y,z) J 



where A,B,C, and D are arbitrary functions of the variables Usted. We see that 



but since cr™"^ = by assumption, the dm<^"^^ ds4>^ term is zero. Also, since each 
(j)p is the solution of an ordinary differential equation with Neumann BCs, we can 
choose solutions whose vertical averages are zero - eliminating the dmC^^dn4>^ 
term as well. This means that the r.h.s. of the horizontal equations become zero 
for all but the 0(1) stages. Since the boundary conditions are also zero for these 
problems, the solutions, (^p>i, must be; identically zero. Whenever a*-' has this 
form, (/)q is the only horizontal function that needs to be found - eliminating 
most of the computation time. 

In practice, ct'^ is often very close to the form shown above. In fact, many 
useful coordinate systems such as the Cartesian, cylindrical, and spherical sys- 
tems are described by cr*^ = ^/gg'^^ which are exactly of this form. It is helpful 
to consider this while iterating. If we can find a value of P such that all 0p>p 
will not significantly influence the overall convergence, or if we calculate that 
the norm of the horizontal equation's r.h.s. is below some threshold, then we 
can tell the leptic solver to stop performing horizontal solves in the interest of 
computation time. Alternatively, suppose we are about to find 0^. We could 
simply set 0^ to zero everywhere and then set up the next vertical stage. If the 
vertical problem is consistent up to some prescribed tolerance, then wo never 
needed the true sohition of the horizontal equation to begin with! Otherwise, 
we can go back and solve the horizontal equation before moving on to the next 
vertical stage. This is a very economical way of deciding which horizontal solves 
are necessary. 
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2.4- An interpretation of e 

So far, we have derived a set of equations that produce a formal solution 
to the original Poisson problem based on the assumption that the parameter 
e in eqs. ^ properly identifies terms of fundamentally different sizes. The 
procedure does not refer to a specific discretization of the equations, but heuris- 
tically depends on the existence of a small parameter. The latter is typically 
derived from an anisotropy inherent in the problem. This could be due to many 
causes, but to appeal to the interests of the author's own research, we will focus 
on an anisotropy in the geometry of the domain and numerical discretization. 
However, it is easy to generalize the foregoing argument regardless of the actual 
source of anisotropy. Naively, the aspect ratio of the domain could be used to 
define such a parameter. However, the following example shows that it must 
also depend on the details of the discretization. 

Suppose we want to solve the isotropic, 2D Poisson equation in a rectangular 
domain. We will choose a uniform discretization with x Nz cell-centers 
and we will define the aspect ratio to be a = H/ L. Upon switching to the 
dimensionless variables x — x/L and z — z/H, Poisson's equation transforms 
as follows 



We will rescale the field variable, so that it is dimensionless and of O (1), which 
is always possible. Now, apart from an overall scaling of H^^ , it is natural to 
identify the small parameter e with the aspect ratio of the grid. However, a 
little reflection shows that a more "quantitative" definition of e cannot ignore 
the discretization altogether. Indeed, once discretized, the term involving x- 
derivatives is at most ^ ot^N"^ while the term involving z-derivatives is at least 
1. This means that if aN^ <^ 1, then will be fundamentally smaller than 
jf^, and so the "smallness" of e depends on both the aspect ratio of the domain 



P 
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and on the details of its discretization. It follows that, in the continuum limit, 
e cannot be a priori ever guaranteed to be small. 

Scotti and Mitran quantified these results in a more general manner. In 
their paper [l^, they define the grid's leptic ratio, A = min (Ax^/iJ), where 
min Ax™ is the minimum grid spacing in the directions other than the vertical. 
Note that the leptic ratio is controlled by the overall apsect ratio of the domain 
and by the degree of "coarseness" of the discretization in the horizontal direc- 
tions. It is shown that if A > 0(1), then the computational grid is "thin," and 
the summarized equations of the previous section will indeed produce solutions 
whose sum converges to the solution of ([T]). This result extends to more exotic, 
N-dimensional geometries if we identify the symmetric tensor field, cr*^ , with 

At first, it would seem that this method is of very limited practical utility, 
being convergent only on rather coarse grids. Moreover, it it is at odds with 
the very sensible idea that finer grids should lead to better results. However, 
in this article we pursue the idea that even when the leptic ratio of the grid is 
below critical, so that the expansion will not converge, the method can still be 
used to accelerate the convergence of conventional methods. Looesely speaking, 
the idea is that the r.h.s. can be partitioned between a component that can 
be represented on a grid with A > 1 and a remainder which needs a finer grid 
with A < 1. Restricted to the former component, the expansion converges, 
while it diverges on the latter. On the contrary, traditional method such as 
BiCGStab tend to converge fast on the latter, and slow on the former. By 
judiciously blending both methods, we can achieve a uniformly high level of 
convergence. Also, note that since the expansion is formulated analytically, 
it can be implemented regardless of any particular choice of discretization of 
the domain. In the following examples, we will use a second-order scheme on 
a staggered grid, but the method is by no means restricted to this type of 
discretization. 



18 



3. Convergence estimates 

3.1. Restricted case 

The heuristic arguments used earher can be given an analytic justification 
in the case of a simple rectangular geometry. Once again, we limit to two 
dimensions, the extension to higher dimensional spaces being trivial. The elliptic 
equation we wish to solve is 

{dl + dl)4^{x,z)^p{x,z) (16) 

with homogeneous Neumann boundary conditions. Without loss of general- 
ity, we can set p (a;, z) equal to an arbitrary eigenfunction of the operator, 
cos {kx) cos (mz). The exact, analytic solution of eq. ([TS]) becomes simply 

cos {kx) cos (mz) 

4> [X, z) = — . 

fc^ + m-^ 

Now, let's investigate how the leptic solver would have arrived at a solution. 
First, we will write eq. as 

{edl + dl) {(f>o + £01 + . . .) = cos (kx) cos (mz) , (17) 

where the e is only used to identify small terms and will eventually be set to 1. 
Equating various powers of e gives us 

d'^ipo — cos (kx) cos (mz) 
etc . . . 

The solution to the 0(1) equation is 

cos (kx) cos (mz) 

00 (x, z) = 5 . 

— m^ 

The constant of integration, that is 0q (x), is identically zero since the BCs are 
homogeneous and there is no need for an excess function. The O (e) equation 
becomes 

9^01 = ^ cos {kx) cos (mz) 

m^ 
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Continuing in this manner, we see that the solution at O (e") is 




which means that if we terminate the leptic solver at O (e^), the solution we 
arrive at is given by the sum 



where wo have set e to 1. If k"^ /m? ^ 1, then this geometric scries will diverge 
as p — >■ oo and the leptic solver will ultimately fail. On the other hand, if 
k"^ /m? < 1, then the scries will be finite for all values of p and we can put the 
solution in a closed form. 



In the limit p ^ oo, the term in braces tends to 1 and we recover the exact, 
analytic solution of the elliptic equation. 

Since the wavenumbers k and m are both positive, we can simply say that 
convergence of the leptic method requires max(A;/m) < 1. Analytically, this 
quantity depends on the harmonic content of the source, p{x,z), and is, in 
principle, unbounded. Numerically, once a discretization has been chosen, k 
and m are limited to a finite number of values. If we let the source term be a 
general linear combination of eigenfunctions and if our rectangular domain has 
dimensions L by H divided uniformly into elements of size Aa; by Az, and it is 
discretized with a spectral method, then max(fc) = ir/Ax and min(m) = w/H. 
This produces our convergence condition, H/Ax < 1, which is the origin of the 
leptic ratio, A = mm{Ax"^/H), and its square inverse, e = max {H / Ax"^)^ , 
used throughout this paper. 

Notice that this convergence condition is a restriction on how we must dis- 
cretize a given domain, it is not directly a restriction on the source term of the 




(18) 
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elliptic equation at hand. This means that the leptic method should converge 
similarly for all equations that use a particular uniform, rectangular grid. If the 
grid is not rectangular or uniform, then the relavant convergence condition is 
Hi/ lS.Xi < 1 at all grid positions, i. Here, Hi is the vertical height of the domain 
at Xi and Ax^ is the minimum horizontal grid spacing at Xi. 

3.2. General case 

Now, we will extend this argument to the more general case involving the 
positive-definite, symmetric tensor field, cr*^ {x,z). We wish to perform a con- 
vergence analysis on 

{d^a^'d, + e {d^a^^d^ + d^a^'d, + d^a'^d,)} {x, z) ^ p (x, z) . (19) 

Without the exact form of each cr'^ {x, z), we cannot trivially diagonalize the 
operator in (fc, m)-space. We can, however, diagonalize the operator in (x, z)- 
space by performing a small rotation by an angle ^tan"^ ■ This 

casts the equation into the simpler form 

{d, [a^^ + O (e^)] + [ea^^ + O (e^)] d^} <p {x, z) ^ p {x, z) , 

where the x and z now represent the new coordinates. We might as well just 
let cr^^ + O (e^) ^ CT^^ and ecr^^ + O (e^) ^ ecr=^^ so that 

{d^a^'d, + £9,a^"9J </) {x, z) = p {x, z) . (20) 

Even though each term of eq. ([20)) is functionally different than the cor- 
responding terms of eq. (jl7l) . their magnitudes are equal. This means that a 
convergence analysis of eq. ([20| must lead to a restriction of the form A > O (1). 
Rotating back to eq. ()19p cannot possibly change this restriction due to the van- 
ishing size of the rotation angle. This shows that even in the general case of 
eq. (Ill), the leptic solver will converge as long as the discretization is chosen to 
satisfy A > 0(1). 
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3.3. The leptic solver as a preconditioner 

Let us return to the simple case of solving V^c/) — cos (kx) cos (mz) on a rect- 
angular domain. After applying ni iterations of the leptic solver, the residual, 
r, is found via eq. (|18p with p ^ ni — 1. 



cos (kx) cos (mz) — V 



2 



cos (kx) cos (mz) "^-^ / k^ 



p ^2 / ^2 

= COS (kx) cos (mz) — cos (kx) cos (mz) > rr 

n=0 ^ 

COS (kx) COS (mz) 




The last line comes from collapsing the telescoping set of sums. This gives us 
an amplification factor for each eigenmode of the residual. That is, if we are 
given a generic residual and perform an eigenvector expansion, 

r (x, z) — r (fc, m) cos (kx) cos (mz) , 

fc m 

then the magnitudes of the individual components, r(k,m), will be amplified 
or attenuated by k"^ /m? each time we iterate. If the grid is constructed such 
that A > O (1), then r (k,m) will always be attenuated since max (fc^/m^) < 1. 
If, however, the grid's leptic ratio is ^ O (1), then only those eigenmodes with 
k'^/m^ < 1 will diminish and those with k"^ /m^ > 1 will be amplified. In 
(a;, z)-space, this effect appears as a diverging solution, but in (fc, m)-space, 
we can see that the solution is split into converging and diverging parts - we 
are conditioning the solution. For this reason, even though preconditioning is 
normally understood as the action of substituting the original operator with a 
modified one with better spectral properties jisjl , we will use preconditioning to 
mean the action of replacing an initial guess with one that has better spectral 
support. 

As an example, consider a 64 x 64 x 16 grid with Aa; — (1, 1, 0.1). This fixes 
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£ at 2.56, which is large enough to cause problems for the leptic solver|3 In 
order to learn how the solvers are treating the modes on this grid, let's apply 
them to 

= E E E [-r ) ) [-IT ) 

with homogeneous boundary conditions. This is a residual equation whose r.h.s. 
harbors every periodic mode supported by the grid in equal amounts (except 
for the zero modes, which must be removed to be consistent with the boundary 
conditions). We only consider periodic modes to facilitate spectral analysis via 
FFT. In one test, we solved this equation with a BiCGStab solver and in another 
separate test, we used the leptic method. BiCGStab stalled in 26 iterations and 
the leptic solver began to diverge after 24 iterations. Once progress came to a 
halt, we performed an FFT to locate which modes were converging slowest. To 
simplify the visualization, we found the {kx,ky) slice that contained the most 
slowly converging modes (which, consistent with the previous analysis, is the 
smallest value = 27r/iJ). The results are in figures [2] and Isfl 

The color in these plots represent the base-10 logarithm of the Fourier co- 
efficients of each mode. It is apparent that each method has its own distinct 
problem region shown in red. The BiCGStab solver has the most trouble dealing 
with low frequency modes while the leptic solver has trouble with high frequency 
modes. Since the leptic solver produced a residual whose largest modes can eas- 
ily be handled by BiCGStab, we apply the BiCGStab using as initial guess the 
output of the leptic solver after 18 iterations. Now BiCGStab is able to converge 
quickly (figure H]). 

We should mention that this is just an illustration. In this example, the 
BiCGStab method on such a small grid would have converged on its own in 
a reasonable number of iterations. On a larger grid, BiCGStab alone often 

■^For the remainder of this paper, we will be dealing with a geometric anisotropy quantified 

by A. The perturbation parameter is then e = (H/Ax)'^. 

*VisIt has been developed and is being distributed by the Lawrence Livermore National 

Laboratory. Website; https://wci.llnl.gov/codes/visit. 
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Figure 2: A 2-dimensional slice of the residual error after 26 iterations of BiCGStab. Colors 
are on a logarithmic scale. Notice the large error near the center of the plot, indicating 
BiCGStab's difficulty eliminating low frequency errors. The blue lines are the zero-frequency 
modes that must be fixed to agree with the boundary conditions. 

converges too slowly to be a viable solution method and sometimes stalls due 
to the condition number of the operator. Further complications arise when we 
use mapped coordinates because this tends to drive the condition number of the 
operator even higher. In these situations, using the leptic method to generate 
a suitable initial guess becomes quite useful. We will illustrate this effect in 
further detail in section |4l 
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Figure 3: The residual error after 24 iterations of the leptic solver. This solver eliminates 
low frequency errors much more effectively than high frequency errors, indicating the leptic 
solver's potential to serve as a preconditioner for BiCGStab. 

4. Demonstrations 

In this section, we will create a sample problem on various numerical domains 
in order to compare the effectiveness of traditional solvers with methods that 
utilize the leptic solver. Our traditional solver of choice will be the BiCGStab 
method preconditioned with the incomplete Cholesky factorization (IC) of the 
elliptic operator For simplicity, we will use a rectangular domain and the 
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Figure 4: The residual error of the BiCGStab method when given an initial guess generated 
by the leptic solver. 

r.h.s. will be generated by taking the divergence of a vector field 




This vector field also generates the boundary conditions. To compare the solvers, 
we will plot the relative residual as a function of the iteration number. For what 
follows, the vertical and horizontal solves of the leptic solver will each be counted 
each as an iteration. 
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4.I. High leptic ratio - Cartesian coordinates 

First, we set N = (64,64,16) and Ax = (0.1,0.1,0.001). This fixes e at 
0.0256, which has weh within the region where the leptic solver outperforms 
traditional methods. Figure [S] shows the results. The leptic solver is clearly 
the more efficient method. In only 6 iterations, it is able to achieve a relative 
residual of 10^^". We would have needed 170 iterations of the BiCGStab/IC 
solver to obtain a residual error of that magnitude. 
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Figure 5: With e ~ 1/40, the leptic solver is clearly more efficient than BiCGStab. Since we 
are using Cartesian coordinates, the leptic solver only needed to perform one horizontal solve. 



4-2. Borderline cases - Cartesian coordinates 

When e = 0(1), the leptic solver may or may not be the most efficient 
solver. We will denote these situations as borderline cases. In the first borderline 
case, we will bring e to 1 by setting N = (64,64, 10) and Ax = (0.1,0.1,0.01). 
Figure |6] shows that the leptic solver requires approximately 5 times as many 
iterations as it did in the e ~ 0.0256 example to achieve an O (lO^^°) residual 
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error. The BiCGStab/IC method, however, converges a bit more quickly than 
it did in the previous example. It required only 90 iterations to catch up to 
the leptic method. This is emperical evidence of our theoretical assertion - by 
raising e (decreasing the lepticity), the leptic solver becomes less effective and 
the traditional method becomes more effective. In this specific borderline case, 
the leptic solver outperforms the BiCGStab method. 
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Figure 6: The convergence patterns of the leptic and BiCGStab solvers when e = 1 and 
condition number fa 10^'^ . The spikes in the BiCGStab residual are due to restarts. 

By varying Nx and Ny^ we can generate an entire class of grids with the 
same e. For example, if we bring N up to (256, 256, 10), the BiCGStab method 
should not converge as rapidly as before. On the other hand, e is still 1, which 
means the leptic solver should perform almost as well as it did on the 64 x 64 x 10 
grid. This is because most of the leptic solver's convergence relies on the vertical 
solver. This is true in general when the horizontal solver is able to be switched off 
(see section [273| - as the horizontal domain grows, the leptic solver outperforms 
traditional relaxation methods. This effect is shown in figure [71 By comparing 
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figures [5] and [71 we see that unlike tlie leptic metliod, tlie BiCGStab/IC metiiod 
is in fact slowed down by the larger horizontal grid. 

The true value of the leptic solver is illustrated by when we use it to generate 
a suitable initial guess for the BiCGStab/IC solver (see section [?3)) . This initial 
guess has an error that is dominated by high wavenumbers. The BiCGStab 
solver then rapidly removes those errors as shown by the dash-dot curve in 
figure [71 
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Figure 7: The performance of various solution methods with e = 1 and condition number 
^ 10®'*. After a few iterations of the leptic solver, BiCGStab can achieve a fast convergence 
rate. Using a preconditioner such as an incomplete Cholesky decomposition can drive this 
rate even higher. 

As our final isotropic, borderline case, we will set N — (50, 50, 50) and Ax = 
(0.1,0.1,0.004). This is appropriate for a cubic, vertically stratified domain 
and brings e up to 4. The BiCGStab/IC solver does not provide immediate 
convergence and the leptic method would have started to diverge after its third 
iteration, but when we combine the two methods as we did in the last example, 



29 



we see a much more rapid convergence than either method could individuahy 
achieve (figure 
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Figure 8: Performance of the solvers with e = 4 and condition number Ri 10® '^. The leptic 
solver began diverging after it's third iteration, so control was passed to the Krylov solver. 
Again, the leptic solver proves most valuable as a preconditioner for the BiCGStab/IC solver 
when e = (1). 



4-3. High leptic ratio - Mapped coordinates 

When the metric is diagonal, several of the terms in our expansion (sec. 12.21) 
vanish. This means we can remove much of the code to produce a more efficient 
algorithm. This reduced code is what generated the results of the previous 
sections. However, in these simple geometries the value of the leptic expansion 
is somewhat limited because it is normally possible to employ fast direct solvers. 
Not so in the case we consider now, where we apply the full algorithm by 
considering a non-diagonal metric. The metric we will use is routinely employed 
in meteorological and oceanic simulations of flows over non uniform terrain. It 
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maps the physical domain characterized by a variable topography z = h{x,y) 
to a rectangular computational domain. Although any relief may be specified, 
it is sufficient for our illustrative purposes to simply let the depth go from d/2 
to d linearly as x goes from to L, where d <0 (Figure IH]). 





Figure 9: A cross section of the coordinate mapping. The thick line denotes the lower vertical 
boundary. 



We define hi^ - f + ^C, where (x, y, z) ^ ry, • 
the symmetric tensor, a^^ — ^/gg^^ , is given by 



This means that 
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For the next two examples, we will be seeking a simple 
'/'sol , rj, ^)^^°^\~) \ ~ ) 
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solution by setting u** — y/gg^-' dj(t)so\ and p — diu*^ . 

Setting N = (256, 256, 64) and Ax = (0.25, 0.25, 0.0025) gives us e = 0.4096. 
After approximately 200 iterations, the BiCGStab method begins to stall with 
a relative residual of O (lO^^). On the other hand, the leptic solver was able 
to reach a relative residual of 3.637 x 10^^ in only 10.2% of the time before 
terminating at O (e"*) . The reason the leptic method stopped iterating was due 
to inconsistent data given to the vertical solver. In section[2l we showed that at 
each iteration, a 2D poisson problem, namely eq. (jlip . must be handed to a tra- 
ditional solver and if that solver fails, the next vertical stage may have a source 
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term that is not compatible with its boundary conditions. Remarkably, the hor- 
izontal stages failed (abandoned due to stalling) at every order of calculation 
and the leptic solver was still able to out-perform the BiCGStab method. Had 
our decision-making algorithm been altered to abandon the horizontal solver 
after failing the first time, which is reasonable for some problems, we would 
have converged much faster. It should be pointed out that until O (e^) , the 
leptic solver did not diverge, therefore we never needed to transfer control to a 
full 3D BiCGStab solver - which would have been slow. 
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Figure 10: Performance of the leptic and Krylov solvers with an anisotropic ct*^ and e ~ 0.4. 
Since our solver is using Chombo's matrix-free methods (knovirn as shell matrices in some 
popular computing libraries such as PETSc [ifil). the Cholesky decomposition of the elliptic 
operator can not be performed. 



4-. 4- Borderline case - Mapped coordinates 

For this scenario, we set N = (64,64, 10) and Aa; — (0.5,0.5,0.1) giving us 
e = 4. To solve our sample problem on this grid, we let the leptic and BiCGStab 
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solvers work together, iteratively. As soon as one solver begins to stall or diverge, 
control is passed to the other solver. The effectiveness of this algorithm is 
explained in section 13.31 - the leptic solver first reduces low frequency errors 
until high frequency errors begin to dominate the residual, then the BiCGStab 
solver reduces high frequency errors until low frequency errors dominate the 
error. This continues until the residual error has been reduced over the entire 
spectrum supported by the grid. Figure [TT] illustrates the effectiveness of this 
algorithm rather convincingly. 
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Figure 11: The convergence pattern of a hybrid leptic/Krylov method when £ = 4 on an 
anisotropic domain. In this case, neither method would have individually provided rapid 
convergence, but when the methods are combined, we see a very rapid convergence and the 
introduction of another preconditioner (eg. IC) is not necessary. 



5. Discussion 

The leptic method was originally devised as a method to add a physically 
appropriate amount of dispersion when numerically modeling the propagation 
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of nonlinear waves in a dispersive medium. In this paper, we generalized the 
method so it could be used to actually speed up the numerical solution of Poisson 
problems characterised by a high level of anisotropy. The key idea is that 
instead of (or in addition to) preconditioning the operator to achieve overall 
better spectral properties, we precondition the initial guess (or the restarts) to 
achieve better spectral support of the residual by coupling the leptic expansion 
to Krylov methods. However, since the former converges on its own if the 
lepticity of the grid is sufficiently large, it is easy to see that it could be used 
within multigrid methods as well. Namely, when the coarsening reaches a point 
that the lepticity of the grid is below critical, the leptic expansion can be used 
in lieu of the relaxing stage to generate an exact solution at the coarse level. 
This would likely cut down the layers of coarsening. 

In its full generality, this method can be used to solve anisotropic Pois- 
son equations in arbitrary, ZJ-dimensional coordinate systems. In many cases, 
however, numerical analysis is performed using simple, rectangular coordinates 
without an anisotropic tensor, a^^ . This simplification reduces much of the com- 
putation. For these common purposes, we included a summary of the Cartesian 
version of the expansion. Both the general and Cartesian expansions are sum- 
marized in section [2.21 

When implementing the leptic method for numerical work, the computa- 
tional domain should be split in all but the vertical (stiff) dimension. The ver- 
tical ordinary differential equations require a set of integrations - one for each 
point in the horizontal plane. With this domain decomposition, these integra- 
tions are independent of one another and the solutions to the vertical problems 
can be found via embarassingly parallel methods. On the other hand, the so- 
lutions to the horizontal equations cannot be parallelized as trivially, making 
their solutions more costly to arrive at. In light of this, section [^75] was provided 
to discuss when it is appropriate to eliminate these expensive stages. 

The computational cost of a single iteration is of the same order as a precon- 
ditioned step of a Krylov method. The savings are obtained in the faster rate 
of convergence, shown in the examples above over a wide range of anisotropic 
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conditions, as well as the relative ease with which the leptic expansion can be 
parallelized. The best rate of convergence is achieved by using the leptic expan- 
sion at the beginning and every time the convergence rate of the Krylov method 
slows down. We did not attempt to predict a priori after how many steps the 
switch is necessary. If the Poisson problem, as it often happens, is part of a 
larger problem that is solved many times, a mockup problem should be solved 
at the beginning to determine empirically the best switch pattern. 

Adapting the expansion to accept Dirichlet boundary conditions would re- 
quire a new derivation similar to that of section [2l but the job would be much 
simpler since Dirichlet elliptic operators have a trivial null space, thereby elim- 
inating the need to consider compatibility conditions among the second-order 
operators and their boundary conditions. 
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